Introduction

This data comes from the Seattle Public Library checkout dataset, which tracks the number of checkouts per month from 2013 to 2023. For this assignment, I decided to analyze checkout trends for the Throne of Glass series, for Harry Potter and the Sorcerer’s Stone versus Harry Potter and the Deathly Hallows, and for the authors Sarah J. Maas and Victoria Aveyard, both of whom I read consistently. Throne of Glass is the first book series that really caught my attention: I read all 7 books in a month and finished the last one (Kingdom of Ash) in a day. For Harry Potter, I wanted to see how the number of checkouts for the first book compares with the number for the last book.

Summary Information

This dataframe is extremely large, so I decided to find the number of rows, the number of columns, the total number of checkouts for one of my favorite authors, and, for the book I am currently reading, the year with the most checkouts and the total number of checkouts. The dataframe has 4,224,916 rows and 13 columns. For Sarah J. Maas, I calculated her total number of checkouts from 2013 to 2023 across all of her books, including the Throne of Glass (ToG) novellas: 54,108 checkouts. The book I am currently reading is “The Invisible Life of Addie LaRue,” which came out in 2020. Its best year for checkouts so far was 2021, and as of January 2023 it has a total of 6,429 checkouts.
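As a rough illustration of how these five values could be computed with dplyr, here is a minimal sketch. It runs on a tiny made-up data frame rather than the real dataset, so the numbers it produces are not the ones reported above. The column names `CheckoutYear` and `Checkouts` and the object name `spl_df` are assumptions, and this is not necessarily how the original summary script was written.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the checkout data. All rows and counts are made up,
# and the column names are assumptions about the real dataset.
spl_df <- data.frame(
  Title = c("Throne of Glass", "Throne of Glass",
            "The Invisible Life of Addie LaRue",
            "The Invisible Life of Addie LaRue",
            "The Invisible Life of Addie LaRue"),
  Creator = c("Sarah J. Maas", "Maas, Sarah J.",
              "V. E. Schwab", "V. E. Schwab", "V. E. Schwab"),
  CheckoutYear = c(2020, 2021, 2020, 2021, 2021),
  Checkouts = c(10, 15, 4, 9, 7)
)

# Values 1 and 2: size of the data frame.
n_rows <- nrow(spl_df)
n_cols <- ncol(spl_df)

# Value 3: total checkouts for one author. The case-insensitive match
# catches both "Sarah J. Maas" and "Maas, Sarah J.", but it would also
# catch any other creator whose name contains "maas".
maas_total <- spl_df %>%
  filter(str_detect(Creator, regex("maas", ignore_case = TRUE))) %>%
  summarize(total = sum(Checkouts, na.rm = TRUE)) %>%
  pull(total)

# Checkouts per year for one title.
addie_by_year <- spl_df %>%
  filter(str_detect(Title, regex("addie larue", ignore_case = TRUE))) %>%
  group_by(CheckoutYear) %>%
  summarize(total = sum(Checkouts, na.rm = TRUE))

# Value 4: the year with the most checkouts for that title.
addie_peak_year <- addie_by_year %>%
  filter(total == max(total)) %>%
  pull(CheckoutYear)

# Value 5: total checkouts for that title across all years.
addie_total <- sum(addie_by_year$total)
```

Swapping `spl_df` for the full dataset loaded with `read.csv()` should give the real values, provided the column names match.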

The Dataset

This data was collected and published by the Seattle Public Library.

The data collected include the format, year, month, and number of checkouts per book. They also include the title, the author(s), publishers, genre, and publication year of the book.

An artist who was creating a data visualization exhibition used public library data to make their art. To have this data, they had to be collecting it continuously, and they had been for 10 years.

The data was initially collected as part of a data visualization exhibition. Later, a federal mandate for open data was implemented, and library data was one of the kinds of data chosen to be collected and published for the public.

I could not think of many ethical questions about collecting this data, because no personal information is tied to it.

An issue with the data is inconsistency within the “Title” and “Creator” columns. For example, when searching for the title “The Assassin’s Blade,” there are multiple variations of the title even though every record is by the same author. The book may also be categorized differently because it is not part of the regular ToG series: although it is the beginning of the main character’s story, it is not technically “Book 1.” Other small details can differ as well, which makes it more difficult to find the right book. The “Creator” column has a similar issue. Some records list “Sarah J. Maas” while others list “Maas, Sarah J.” For larger dataframes this would cause more of an issue, but this dataframe is only part of the whole.
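One possible way to work around these inconsistencies is to build normalized lookup keys before filtering. The sketch below is only an illustration on made-up rows: the title variants, the checkout counts, and the helper columns `title_key` and `creator_key` are all hypothetical. It also assumes a reversed name has exactly one comma, so a value with extra parts would need more handling.

```r
library(dplyr)
library(stringr)

# Made-up records showing the kinds of variation described above.
records <- data.frame(
  Title = c("The Assassin's Blade",
            "The assassin's blade : the Throne of Glass novellas",
            "The Assassin\u2019s Blade (Unabridged)",
            "Red Queen"),
  Creator = c("Sarah J. Maas", "Maas, Sarah J.",
              "Sarah J. Maas", "Victoria Aveyard"),
  Checkouts = c(3, 5, 2, 8)
)

cleaned <- records %>%
  mutate(
    # Lowercase the title and turn curly apostrophes into straight ones,
    # so every variant contains the same searchable text.
    title_key = str_replace_all(str_to_lower(Title), "\u2019", "'"),
    # Turn "Last, First" into "First Last"; leave other names unchanged.
    creator_key = str_squish(if_else(
      str_detect(Creator, ","),
      str_replace(Creator, "^\\s*([^,]+),\\s*(.+)$", "\\2 \\1"),
      Creator
    ))
  )

# All three variants of the title now match a single filter.
blade_total <- cleaned %>%
  filter(str_detect(title_key, fixed("assassin's blade")),
         creator_key == "Sarah J. Maas") %>%
  summarize(total = sum(Checkouts, na.rm = TRUE)) %>%
  pull(total)
```

Matching on a substring of the cleaned title keeps subtitle and format suffixes from splitting one book into several, at the cost of possibly pulling in unrelated titles that share the phrase.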

Your Choice

For the last chart, I compared the number of checkouts for two authors, Sarah J. Maas (SJM) and Victoria Aveyard. These authors have a hold on many YA readers. Their stories are compelling and entertaining, and they know how to keep your attention. Despite their similarities in storytelling, SJM clearly leads Victoria Aveyard in total checkouts per month. The spike for SJM begins in November 2020, which is around the time BookTok became popular on TikTok.
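The monthly totals behind a comparison like this could be prepared roughly as follows. This is a sketch on made-up rows, not the code used for the chart. The column names `CheckoutYear`, `CheckoutMonth`, and `Checkouts` are assumptions, and the `author` column is a hypothetical helper that folds both name orders into one label.

```r
library(dplyr)
library(stringr)

# Made-up rows; the real data has many more titles and months.
checkouts <- data.frame(
  Creator = c("Sarah J. Maas", "Maas, Sarah J.", "Victoria Aveyard",
              "Sarah J. Maas", "Aveyard, Victoria"),
  CheckoutYear = c(2020, 2020, 2020, 2020, 2020),
  CheckoutMonth = c(11, 11, 11, 12, 12),
  Checkouts = c(20, 5, 6, 30, 4)
)

monthly_by_author <- checkouts %>%
  # Label each row with one consistent author name; rows that match
  # neither author get NA and are dropped in the next step.
  mutate(author = case_when(
    str_detect(Creator, regex("maas", ignore_case = TRUE)) ~ "Sarah J. Maas",
    str_detect(Creator, regex("aveyard", ignore_case = TRUE)) ~ "Victoria Aveyard"
  )) %>%
  filter(!is.na(author)) %>%
  # One row per author per month, summed across all of their titles.
  group_by(author, CheckoutYear, CheckoutMonth) %>%
  summarize(total_checkouts = sum(Checkouts, na.rm = TRUE), .groups = "drop") %>%
  arrange(CheckoutYear, CheckoutMonth, author)
```

The resulting data frame has one row per author per month, which is the shape a line chart with one line per author needs.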